AITopics | local density

Collaborating Authors

local density

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Enabling AI Scientists to Recognize Innovation: A Domain-Agnostic Algorithm for Assessing Novelty

Wang, Yao, Cui, Mingxuan, Jiang, Arthur

arXiv.org Artificial IntelligenceMar-10-2025

In the pursuit of Artificial General Intelligence (AGI), automating the generation and evaluation of novel research ideas is a key challenge in AI-driven scientific discovery. This paper presents Relative Neighbor Density (RND), a domain-agnostic algorithm for novelty assessment in research ideas that overcomes the limitations of existing approaches by comparing an idea's local density with its adjacent neighbors' densities. We first developed a scalable methodology to create test set without expert labeling, addressing a fundamental challenge in novelty assessment. Using these test sets, we demonstrate that our RND algorithm achieves state-of-the-art (SOTA) performance in computer science (AUROC=0.820) and biomedical research (AUROC=0.765) domains. Most significantly, while SOTA models like Sonnet-3.7 and existing metrics show domain-specific performance degradation, RND maintains consistent accuracies across domains by its domain-invariant property, outperforming all benchmarks by a substantial margin (0.795 v.s. 0.597) on cross-domain evaluation. These results validate RND as a generalizable solution for automated novelty assessment in scientific research.

algorithm, enabling ai scientist, novelty, (14 more...)

arXiv.org Artificial Intelligence

2503.01508

Country:

North America > United States (0.04)
Europe > Italy (0.04)
Asia > China > Hubei Province > Wuhan (0.04)
Africa > Sub-Saharan Africa (0.04)

Genre:

Research Report > Strength High (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Research Report > Promising Solution (0.68)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
(9 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(5 more...)

Add feedback

Efficient Sparsification of Simplicial Complexes via Local Densities of States

Savostianov, Anton, Schaub, Michael T., Guglielmi, Nicola, Tudisco, Francesco

arXiv.org Machine LearningFeb-11-2025

Simplicial complexes (SCs), a generalization of graph models for relational data that account for higher-order relations between data items, have become a popular abstraction for analyzing complex data using tools from topological data analysis or topological signal processing. However, the analysis of many real-world datasets leads to dense SCs with a large number of higher-order interactions. Unfortunately, analyzing such large SCs often has a prohibitive cost in terms of computation time and memory consumption. The sparsification of such complexes, i.e., the approximation of an original SC with a sparser simplicial complex with only a log-linear number of high-order simplices while maintaining a spectrum close to the original SC, is of broad interest. In this work, we develop a novel method for a probabilistic sparsifaction of SCs. At its core lies the efficient computation of sparsifying sampling probability through local densities of states as functional descriptors of the spectral information. To avoid pathological structures in the spectrum of the corresponding Hodge Laplacian operators, we suggest a "kernel-ignoring" decomposition for approximating the sampling probability; additionally, we exploit error estimates to show asymptotically prevailing algorithmic complexity of the developed method. The performance of the framework is demonstrated on the family of Vietoris--Rips filtered simplicial complexes.

artificial intelligence, machine learning, simplicial complex, (16 more...)

arXiv.org Machine Learning

2502.07558

Country:

Europe > Germany > North Rhine-Westphalia > Cologne Region > Aachen (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > Switzerland > Geneva > Geneva (0.04)
Europe > Italy > Abruzzo > L'Aquila Province > L'Aquila (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.87)

Add feedback

Detecting outliers by clustering algorithms

Li, Qi, Wang, Shuliang

arXiv.org Machine LearningDec-7-2024

Clustering and outlier detection are two important tasks in data mining. Outliers frequently interfere with clustering algorithms to determine the similarity between objects, resulting in unreliable clustering results. Currently, only a few clustering algorithms (e.g., DBSCAN) have the ability to detect outliers to eliminate interference. For other clustering algorithms, it is tedious to introduce another outlier detection task to eliminate outliers before each clustering process. Obviously, how to equip more clustering algorithms with outlier detection ability is very meaningful. Although a common strategy allows clustering algorithms to detect outliers based on the distance between objects and clusters, it is contradictory to improving the performance of clustering algorithms on the datasets with outliers. In this paper, we propose a novel outlier detection approach, called ODAR, for clustering. ODAR maps outliers and normal objects into two separated clusters by feature transformation. As a result, any clustering algorithm can detect outliers by identifying clusters. Experiments show that ODAR is robust to diverse datasets. Compared with baseline methods, the clustering algorithms achieve the best on 7 out of 10 datasets with the help of ODAR, with at least 5% improvement in accuracy.

algorithm, odar, outlier, (16 more...)

arXiv.org Machine Learning

2412.05669

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback

User Feedback and Sample Weighting for Ill-Conditioned Hand-Eye Calibration

Horn, Markus, Wodtko, Thomas, Buchholz, Michael, Dietmayer, Klaus

arXiv.org Artificial IntelligenceAug-11-2023

Hand-eye calibration is an important and extensively researched method for calibrating rigidly coupled sensors, solely based on estimates of their motion. Due to the geometric structure of this problem, at least two motion estimates with non-parallel rotation axes are required for a unique solution. If the majority of rotation axes are almost parallel, the resulting optimization problem is ill-conditioned. In this paper, we propose an approach to automatically weight the motion samples of such an ill-conditioned optimization problem for improving the conditioning. The sample weights are chosen in relation to the local density of all available rotation axes. Furthermore, we present an approach for estimating the sensitivity and conditioning of the cost function, separated into the translation and the rotation part. This information can be employed as user feedback when recording the calibration data to prevent ill-conditioning in advance. We evaluate and compare our approach on artificially augmented data from the KITTI odometry dataset.

artificial intelligence, optimization problem, rotation axis, (16 more...)

arXiv.org Artificial Intelligence

2308.06045

Country:

North America > United States > California > Alameda County > Oakland (0.04)
Europe > Germany > Baden-Württemberg (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.70)

Add feedback

On Manifold Learning in Plato's Cave: Remarks on Manifold Learning and Physical Phenomena

Lederman, Roy R., Toader, Bogdan

arXiv.org Artificial IntelligenceJun-30-2023

Many techniques in machine learning attempt explicitly or implicitly to infer a low-dimensional manifold structure of an underlying physical phenomenon from measurements without an explicit model of the phenomenon or the measurement apparatus. This paper presents a cautionary tale regarding the discrepancy between the geometry of measurements and the geometry of the underlying phenomenon in a benign setting. The deformation in the metric illustrated in this paper is mathematically straightforward and unavoidable in the general case, and it is only one of several similar effects. While this is not always problematic, we provide an example of an arguably standard and harmless data processing procedure where this effect leads to an incorrect answer to a seemingly simple question. Although we focus on manifold learning, these issues apply broadly to dimensionality reduction and unsupervised learning.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2304.14248

Country: Europe > United Kingdom (0.04)

Genre: Research Report (0.50)

Industry: Education (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

Preserving local densities in low-dimensional embeddings

Fischer, Jonas, Burkholz, Rebekka, Vreeken, Jilles

arXiv.org Artificial IntelligenceJan-31-2023

Low-dimensional embeddings and visualizations are an indispensable tool for analysis of high-dimensional data. State-of-the-art methods, such as tSNE and UMAP, excel in unveiling local structures hidden in high-dimensional data and are therefore routinely applied in standard analysis pipelines in biology. We show, however, that these methods fail to reconstruct local properties, such as relative differences in densities (Fig. 1) and that apparent differences in cluster size can arise from computational artifact caused by differing sample sizes (Fig. 2). Providing a theoretical analysis of this issue, we then suggest dtSNE, which approximately conserves local densities. In an extensive study on synthetic benchmark and real world data comparing against five state-of-the-art methods, we empirically show that dtSNE provides similar global reconstruction, but yields much more accurate depictions of local distances and relative densities.

artificial intelligence, machine learning, sne, (17 more...)

arXiv.org Artificial Intelligence

2301.13732

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > Germany > Saarland > Saarbrücken (0.04)

Genre: Research Report > Promising Solution (0.54)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Determinate Node Selection for Semi-supervised Classification Oriented Graph Convolutional Networks

Xiao, Yao, Xu, Ji, Yang, Jing, Li, Shaobo

arXiv.org Artificial IntelligenceJan-11-2023

Graph Convolutional Networks (GCNs) have been proved successful in the field of semi-supervised node classification by extracting structural information from graph data. However, the random selection of labeled nodes used by GCNs may lead to unstable generalization performance of GCNs. In this paper, we propose an efficient method for the deterministic selection of labeled nodes: the Determinate Node Selection (DNS) algorithm. The DNS algorithm identifies two categories of representative nodes in the graph: typical nodes and divergent nodes. These labeled nodes are selected by exploring the structure of the graph and determining the ability of the nodes to represent the distribution of data within the graph. The DNS algorithm can be applied quite simply on a wide range of semi-supervised graph neural network models for node classification tasks. Through extensive experimentation, we have demonstrated that the incorporation of the DNS algorithm leads to a remarkable improvement in the average accuracy of the model and a significant decrease in the standard deviation, as compared to the original method.

artificial intelligence, machine learning, node, (14 more...)

arXiv.org Artificial Intelligence

2301.04381

Country: Asia > China (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Leisure & Entertainment > Sports (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

A density peaks clustering algorithm with sparse search and K-d tree

Shan, Yunxiao, Li, Shu, Li, Fuxiang, Cui, Yuxin, Li, Shuai, Zhou, Ming, Li, Xiang

arXiv.org Artificial IntelligenceJul-19-2022

Density peaks clustering has become a nova of clustering algorithm because of its simplicity and practicality. However, there is one main drawback: it is time-consuming due to its high computational complexity. Herein, a density peaks clustering algorithm with sparse search and K-d tree is developed to solve this problem. Firstly, a sparse distance matrix is calculated by using K-d tree to replace the original full rank distance matrix, so as to accelerate the calculation of local density. Secondly, a sparse search strategy is proposed to accelerate the computation of relative-separation with the intersection between the set of $k$ nearest neighbors and the set consisting of the data points with larger local density for any data point. Furthermore, a second-order difference method for decision values is adopted to determine the cluster centers adaptively. Finally, experiments are carried out on datasets with different distribution characteristics, by comparing with other six state-of-the-art clustering algorithms. It is proved that the algorithm can effectively reduce the computational complexity of the original DPC from $O(n^2K)$ to $O(n(n^{1-1/K}+k))$. Especially for larger datasets, the efficiency is elevated more remarkably. Moreover, the clustering accuracy is also improved to a certain extent. Therefore, it can be concluded that the overall performance of the newly proposed algorithm is excellent.

algorithm, cluster center, dataset, (12 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/ACCESS.2022.3190958

2203.00973

Country:

North America > United States > California > Orange County > Irvine (0.14)
Asia > China > Heilongjiang Province > Harbin (0.05)
Asia > China > Tianjin Province > Tianjin (0.04)
(4 more...)

Genre: Research Report (0.40)

Industry:

Information Technology (0.46)
Education (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback

Outlier/Anomalies Detection Using Unsupervised Machine Learning

#artificialintelligenceJul-17-2022, 12:55:25 GMT

PyOD is one such library to detect outliers in your data. It provides access to more than 20 different algorithms to detect outliers and is compatible with both Python 2 and 3. In the case of outlier detection, the algorithm is used differently. Since we do not know the outliers in advance, KNN is used in an unsupervised learning manner. In this scenario, the algorithm finds the closest K nearest neighbors for every data point and measures the average distance.

artificial intelligence, data mining, machine learning, (19 more...)

#artificialintelligence

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

VDPC: Variational Density Peak Clustering Algorithm

Wang, Yizhang, Wang, Di, Zhou, You, Zhang, Xiaofeng, Quek, Chai

arXiv.org Artificial IntelligenceDec-29-2021

The widely applied density peak clustering (DPC) algorithm makes an intuitive cluster formation assumption that cluster centers are often surrounded by data points with lower local density and far away from other data points with higher local density. However, this assumption suffers from one limitation that it is often problematic when identifying clusters with lower density because they might be easily merged into other clusters with higher density. As a result, DPC may not be able to identify clusters with variational density. To address this issue, we propose a variational density peak clustering (VDPC) algorithm, which is designed to systematically and autonomously perform the clustering task on datasets with various types of density distributions. Specifically, we first propose a novel method to identify the representatives among all data points and construct initial clusters based on the identified representatives for further analysis of the clusters' property. Furthermore, we divide all data points into different levels according to their local density and propose a unified clustering framework by combining the advantages of both DPC and DBSCAN. Thus, all the identified initial clusters spreading across different density levels are systematically processed to form the final clusters. To evaluate the effectiveness of the proposed VDPC algorithm, we conduct extensive experiments using 20 datasets including eight synthetic, six real-world and six image datasets. The experimental results show that VDPC outperforms two classical algorithms (i.e., DPC and DBSCAN) and four state-of-the-art extended DPC algorithms.

algorithm, cluster center, dataset, (15 more...)

arXiv.org Artificial Intelligence

2201.00641

Country:

Asia > Singapore (0.04)
Asia > China > Jilin Province > Changchun (0.04)
Asia > China > Jiangsu Province (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback